A Study of Acoustic Features for Emotional Speaker Recognition in I-vector Representation
Abstract
Emotion recognition has recently become very important in the fields of speech and speaker recognition. This paper experimentally investigates which acoustic features are best suited to gender-dependent speaker recognition from emotional speech. Four feature sets, LPC (Linear Prediction Coefficients), LPCC (Linear Prediction Cepstral Coefficients), MFCC (Mel-Frequency Cepstral Coefficients) and PLP (Perceptual Linear Prediction) coefficients, were compared in an experimental speaker recognition system based on the i-vector representation. The system was evaluated on emotional speech recordings from a newly created Slovak emotional database, with the Mahalanobis distance metric as the scoring method. The results showed that the MFCC representation is the best suited for speaker verification from Slovak emotional speech, with a recognition rate higher than 80%.
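The Mahalanobis-distance scoring mentioned in the abstract can be illustrated with a minimal sketch. It assumes that i-vectors have already been extracted and that a covariance matrix (here called `within_cov`, for example a within-speaker covariance estimated on training data) is available. The abstract does not say how the covariance is estimated, so that choice, the function names, and the toy data below are all hypothetical.

```python
import numpy as np

def mahalanobis_score(test_ivec, enrolled_ivec, within_cov):
    """Negative squared Mahalanobis distance; a higher score means more similar."""
    diff = test_ivec - enrolled_ivec
    # Solve a linear system rather than inverting the covariance explicitly.
    return -float(diff @ np.linalg.solve(within_cov, diff))

def identify(test_ivec, enrolled, within_cov):
    """Return the label of the enrolled speaker with the highest score."""
    return max(
        enrolled,
        key=lambda spk: mahalanobis_score(test_ivec, enrolled[spk], within_cov),
    )

# Toy 4-dimensional "i-vectors" (illustrative values only, not real data).
dim = 4
enrolled = {"spk_a": np.zeros(dim), "spk_b": np.full(dim, 3.0)}
within_cov = 0.5 * np.eye(dim)  # assumed covariance, hypothetical
test_ivec = np.array([0.1, -0.2, 0.0, 0.3])

best = identify(test_ivec, enrolled, within_cov)
print(best)
```

For verification rather than identification, the same score would be compared against a threshold tuned on held-out trials.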
Similar resources
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...
Convolutional Neural Network with Adaptive Windows for Speech Recognition
Although speech recognition systems are widely used and their accuracy is continuously increasing, there is a considerable performance gap between their accuracy and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
Speaker independent emotion recognition by early fusion of acoustic and linguistic features within ensembles
Herein we present a comparison of novel concepts for a robust fusion of prosodic and verbal cues in speech emotion recognition. Thereby 276 acoustic features are extracted out of a spoken phrase. For linguistic content analysis we use the Bag-of-Words text representation. This allows for integration of acoustic and linguistic features within one vector prior to a final classification. Extensive...
Acoustic detection of apple mealiness based on support vector machine
Mealiness degrades the quality of apples and plays an important role in the fruit market. Therefore, reliable and rapid sensing techniques for nondestructive measurement and sorting of fruits are necessary. In this study, the potential of acoustic signals of rolling apples on an inclined plate as a new technique for nondestructive detection of Red Delicious apple mealiness was investigate...
Using genetic algorithms to weight acoustic features for speaker recognition
The Mel-Frequency Cepstral Coefficients (MFCC) are widely accepted as a suitable representation for speaker recognition applications. MFCC are usually augmented with dynamic features, leading to high dimensional representations. The issue arises of whether some of those features are redundant or dependent on other features. Probably, not all of them are equally relevant for speaker recognition....